LipschitzLR: Using theoretically computed adaptive learning rates for fast convergence
Authors
Abstract
Similar Articles
Adaptive Rates of Convergence in Active Learning
We study the rates of convergence in classification error achievable by active learning in the presence of label noise. Additionally, we study the more general problem of active learning with a nested hierarchy of hypothesis classes, and propose an algorithm whose error rate provably converges to the best achievable error among classifiers in the hierarchy at a rate adaptive to both the complex...
Monte-Carlo Planning: Theoretically Fast Convergence Meets Practical Efficiency
Popular Monte-Carlo tree search (MCTS) algorithms for online planning, such as ε-greedy tree search and UCT, aim at rapidly identifying a reasonably good action, but provide rather poor worst-case guarantees on performance improvement over time. In contrast, a recently introduced MCTS algorithm BRUE guarantees exponential-rate improvement over time, yet it is not geared towards identifying reas...
Convergence rates for adaptive finite elements
In this article we prove that it is possible to construct, using newest-vertex bisection, meshes that equidistribute the error in H-norm, whenever the function to approximate can be decomposed as a sum of a regular part plus a singular part with singularities around a finite number of points. This decomposition is usual in regularity results of Partial Differential Equations (PDE). As a conseque...
Super-Convergence: Very Fast Training of Residual Networks Using Large Learning Rates
In this paper, we show a phenomenon, which we named "super-convergence", where residual networks can be trained using an order of magnitude fewer iterations than is used with standard training methods. One of the key elements of super-convergence is training with cyclical learning rates and a large maximum learning rate. Furthermore, we present evidence that training with large learning rates im...
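The cyclical learning-rate policy mentioned in this abstract is commonly implemented as a triangular schedule that sweeps linearly between a base rate and a large maximum rate. The sketch below is illustrative only; the function name and parameter values are assumptions, not taken from the paper.

```python
import math

def triangular_lr(step, base_lr, max_lr, half_cycle):
    """Triangular cyclical learning rate.

    Rises linearly from base_lr to max_lr over `half_cycle` steps,
    then falls back to base_lr, repeating indefinitely.
    """
    cycle = math.floor(1 + step / (2 * half_cycle))
    x = abs(step / half_cycle - 2 * cycle + 1)  # distance from cycle peak, in [0, 1]
    return base_lr + (max_lr - base_lr) * max(0.0, 1 - x)

# Example: base 0.001, peak 0.1, half-cycle of 100 steps.
# step 0 -> 0.001, step 100 -> 0.1, step 200 -> back to 0.001
```

A schedule like this would typically be applied per training step, with the returned value assigned to the optimizer's learning rate before each update.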
Journal
Journal title: Applied Intelligence
Year: 2020
ISSN: 0924-669X,1573-7497
DOI: 10.1007/s10489-020-01892-0